In the following exercises we will collect and explore some data from YouTube.
Before we start with this exercise, three short notes on working with the exercise files in this workshop:
You can find the solutions for this exercise as well as all other exercises in the solutions folder in the repo/directory that contains the course materials. You can copy code from these exercise files by clicking on the small blue clipboard icon in the upper right corner of the code boxes.
We would like to ask you to solve all coding tasks by writing them into your own R script files. This ensures that all of your solutions are reproducible, and that you can (re-)use solutions from earlier exercises in later ones.
All exercises and their solutions ‘assume’ they are in the solutions folder within the folder that contains the materials for this course. This way they can make use of files in other folders using relative paths. In order for your scripts to run properly, we suggest that you create (and save) them in the my_code folder (which already includes an almost empty script that you can continue to work with). For the relative file paths to work, you will also need to set your working directory to the folder that contains the script. Otherwise, you may have to change the (relative) file paths in your scripts accordingly.
Now let’s get to it…
To authenticate with the YouTube API, use the yt_oauth function from the tuber package, which requires the ID of your app as well as your app secret as arguments.
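A minimal authentication sketch could look as follows; the two credential strings are placeholders that you need to replace with the ID and secret of your own app from the Google Cloud console:

```r
library(tuber)

# Authenticate with the YouTube Data API.
# Replace the placeholders with the credentials of your own app.
# This opens a browser window in which you grant access interactively.
yt_oauth(app_id = "YOUR_APP_ID",
         app_secret = "YOUR_APP_SECRET")
```
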
While going through the following exercises you might want to monitor your API quota usage via the Google Cloud Platform dashboard for your app: Select APIs & Services -> Dashboard.
To collect statistics for a channel, use the get_channel_stats function, which requires the ID of the channel (as a string) as its main argument.
get_channel_stats("UCiQ98odXlAkX63EaFWNjH0g")
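get_channel_stats returns a list, so you can also store the result and extract individual figures from its statistics element. The following is a sketch assuming the element names used by tuber (note that the API returns the counts as character strings):

```r
# Store the channel statistics instead of just printing them
channel_stats <- get_channel_stats("UCiQ98odXlAkX63EaFWNjH0g")

# Extract single values; the counts arrive as character strings,
# so convert them to numbers before doing any arithmetic
subscribers <- as.numeric(channel_stats$statistics$subscriberCount)
total_views <- as.numeric(channel_stats$statistics$viewCount)
```
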
To collect statistics for a single video, use the get_stats function, which needs the ID of the video as its argument.
get_stats("uHGlCi9jOWY")
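The object returned by get_stats is also a list. If you prefer a tabular format, you could turn selected counts into a one-row data frame; this is a sketch that assumes the element names viewCount, likeCount, and commentCount in the returned list:

```r
# Store the video statistics
video_stats <- get_stats("uHGlCi9jOWY")

# Convert selected counts into a one-row data frame
# (element names viewCount, likeCount, commentCount are assumed)
video_stats_df <- data.frame(
  video_id = video_stats$id,
  views    = as.numeric(video_stats$viewCount),
  likes    = as.numeric(video_stats$likeCount),
  comments = as.numeric(video_stats$commentCount)
)
```
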
To collect the comments for the video, use the get_all_comments function and store the result in an object named comments_lwt_census.
comments_lwt_census <- get_all_comments("1aheRpmurAo")
NB: If you check the comment count on the video's YouTube page, you will see that there are more comments than rows in the data frame you just created. This is because get_all_comments from tuber only collects up to 5 replies per comment.
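You can see this difference yourself by comparing the number of rows in the data frame with the comment count that the API reports for the video (a sketch; it assumes that the get_stats output contains a commentCount element):

```r
# Number of comments (and replies) actually collected
nrow(comments_lwt_census)

# Total comment count reported by the API for the same video
get_stats("1aheRpmurAo")$commentCount
```
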
To save the collected comments, use the base R function saveRDS. Before doing so, you should create a data subfolder in the folder that contains the workshop materials. The code in the solution assumes that your working directory is the my_code folder (or some other subfolder within the one that contains the course materials) and stores the file in the data folder you just created.
saveRDS(comments_lwt_census, "../data/RawComments.rds")
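In a later session you can load the saved object again with readRDS, the counterpart of saveRDS:

```r
# Read the stored comments back in
# (relative path assumes the working directory is the my_code folder)
comments_lwt_census <- readRDS("../data/RawComments.rds")
```
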